Multi-Source Domain Adaptation for Text-Independent Forensic Speaker Recognition
Abstract
Adapting speaker recognition systems to new environments is a widely used technique to improve a well-performing model learned from large-scale data towards task-specific small-scale data scenarios. However, previous studies focus on single-domain adaptation, which neglects the more practical scenario where training data are collected from multiple acoustic domains, as needed in forensic scenarios. Audio analysis for forensic speaker recognition offers unique challenges in model training with multi-domain data due to location/scenario uncertainty and the diversity mismatch between reference and naturalistic field recordings. It is also difficult to directly employ small-scale domain-specific data to train complex neural network architectures, due to domain mismatch and performance loss. Fine-tuning is a commonly used method for adaptation, in order to retrain the model with weights initialized from a well-trained model. Alternatively, in this study, three novel adaptation methods based on domain adversarial training, discrepancy minimization, and moment-matching approaches are proposed to further promote adaptation performance across multiple acoustic domains. A comprehensive set of experiments is conducted to demonstrate that: 1) diverse acoustic environments do impact speaker recognition performance, which could advance research in audio forensics, 2) domain adversarial training learns discriminative features that are also invariant to shifts between domains, 3) discrepancy-minimizing adaptation achieves effective performance simultaneously across multiple acoustic domains, and 4) moment-matching adaptation along with dynamic distribution alignment also significantly promotes speaker recognition performance on each domain, especially for the LENA-field domain with noise, compared to all other systems. The advancements shown here in adaptation therefore help ensure more consistent performance for field operational data in audio forensics.
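The abstract names moment matching but does not describe how it is computed. As a rough illustration of the general idea only, and not the authors' method, the sketch below measures how far the low-order feature moments of several source domains lie from those of a target domain and from each other. The function names, the choice of raw moments up to order 2, and the toy 2-D "embeddings" are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def raw_moment(features, order):
    """Per-dimension raw moment E[x^order] over a list of feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(vec[d] ** order for vec in features) / n for d in range(dim)]


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def moment_distance(sources, target, max_order=2):
    """Sum over moment orders of the mean source-to-target gap
    plus the mean gap between every pair of source domains."""
    total = 0.0
    for order in range(1, max_order + 1):
        tgt = raw_moment(target, order)
        src = [raw_moment(s, order) for s in sources]
        to_target = sum(l2_distance(m, tgt) for m in src) / len(src)
        pairs = list(combinations(src, 2))
        between = (
            sum(l2_distance(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
        )
        total += to_target + between
    return total


# Hypothetical toy embeddings: two source domains and one target domain.
clean = [[0.0, 1.0], [2.0, 1.0]]
noisy = [[1.0, 3.0], [1.0, 5.0]]
field = [[1.0, 1.0], [1.0, 1.0]]

print(moment_distance([clean, clean], clean))  # identical domains
print(moment_distance([clean, noisy], field))  # mismatched domains
```

In a training setup, a quantity of this kind would typically be added to the speaker classification loss so that the embedding extractor is pushed towards features whose statistics agree across domains. How the paper weights or schedules such a term is not stated in the abstract.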
Similar resources
Text-independent Speaker Recognition
In this paper, a text-independent speaker recognition method based on the wavelet transform and mel-cepstrum is presented. The experimental results indicate the best wavelet transform parameters for speaker identification and can be useful for designing speaker identification systems. This kind of person identification method is useful in services such as banking by telephone, access authorization to...
Transformation enhanced multi-grained modeling for text-independent speaker recognition
We describe our formulation of transformation enhanced data modeling used to develop a multi-grained data analysis approach to text independent speaker recognition. The broad goal is to address difficulties caused by sparse training and test data. First, our development of maximum likelihood transformation based recognition with diagonally constrained Gaussian mixture models is detailed. We giv...
Multi-state predictive neural networks for text-independent speaker recognition
Both Hidden Markov Models and Neural Networks have already been used as production systems for speaker identification or verification. Recently [9] has shown that ergodic multi-state hidden Markov Models do not outperform one-state "hidden" Markov Models, i.e. Gaussian Mixture Models, for speaker recognition. She put in evidence that the important characteristic of these models is the total num...
Domain adaptation for text dependent speaker verification
Recently we have investigated the use of state-of-the-art text-dependent speaker verification algorithms for user authentication and obtained satisfactory results mainly by using a fair amount of text-dependent development data from the target domain. In this work we investigate the ability to build high accuracy text-dependent systems using no data at all from the target domain. Instead of usin...
Speaker-specific mapping for text-independent speaker recognition
In this paper, we present the concept of speaker-specific mapping for the task of speaker recognition. The speaker-specific mapping is realized using a multilayer feedforward neural network. In the mapping approach, the aim is to capture the speaker-specific information by mapping a set of parameter vectors specific to linguistic information in the speech, to a set of parameter vectors having li...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2022
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2021.3130975